HMM-based scost quality control for unit selection speech synthesis
نویسندگان
چکیده
This paper describes the implementation of a unit selection text-to-speech system that incorporates a statistical model Cost (sCost), in addition to target and join costs, for controlling the selection of unit candidates. sCost, a quality control measure, is calculated off-line for each unit by comparing HMM based synthesis and recorded speech with their corresponding unit segment labels. Dynamic time warping (DTW) is used to perform such comparison at level of spectrum, pitch and voice strengths. The method has been tested on unit selection voices created using audio book data. Preliminary results indicate that the use of sCost based only on spectrum introduce more variety on style pronunciation but affects quality; whereas using sCost based on spectrum, pitch and voicing strengths improves significantly the quality, maintaining a more stable narrative style.
منابع مشابه
Evaluation of Finnish unit selection and HMM-based speech synthesis
Unit selection and hidden Markov model (HMM) based synthesis have become the dominant techniques in text-to-speech (TTS) research. In this work, we combine HMM-based signal generation with the front end originally designed for unit selection based Finnish TTS and we evaluate the prosody of the output generated by the two synthesis techniques using the same speech database. Furthermore, we study...
متن کاملStudy on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...
متن کاملAnalysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech
We have applied two state-of-the-art speech synthesis techniques (unit selection and HMM-based synthesis) to the synthesis of emotional speech. A series of carefully designed perceptual tests to evaluate speech quality, emotion identification rates and emotional strength were used for the six emotions which we recorded – happiness, sadness, anger, surprise, fear, disgust. For the HMM-based meth...
متن کاملAcoustic model selection and voice quality assessment for HMM-based Mandarin speech synthesis
This paper presents a preliminary study in implementing HMM-based Mandarin speech synthesis system, whose main advantage exists in generating various voices. A variety of acoustic unit representations for Mandarin are compared to select an optimal acoustic model set. Syllabic vs. sub-syllabic, context-independent vs. context-dependent, toneless vs. tonal, initial-final vs. preme-toneme models, ...
متن کاملTying covariance matrices to reduce the footprint of HMM-based speech synthesis systems
This paper proposes a technique of reducing footprint of HMMbased speech synthesis systems by tying all covariance matrices. HMM-based speech synthesis systems usually consume smaller footprint than unit-selection synthesis systems because statistics rather than speech waveforms are stored. However, further reduction is essential to put them on embedded devices which have very small memory. Acc...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013